Expanding Opinion Lexicon with Domain Specific Opinion Words Using Semi-Supervised Approach
Authors
Abstract
Opinion words, opinion phrases, and idioms are very useful in sentiment analysis. Together, these terms make up opinion (or sentiment) lexicons: large lists of terms, each annotated with its sentiment. To create such a lexicon automatically, high-precision classifiers generally use known sentiment vocabulary, e.g. the word-level prior polarity of an adjective, to extract corresponding phrases from an unannotated text collection. Most unsupervised approaches try to determine the prior polarity, also called semantic orientation, of adjectives. However, adjective phrases and verb phrases are useful indicators of sentiment as well. To build domain-independent opinion lexicons, classifiers need to be applied to a large number of corpora covering different text categories. This introduces the challenge of ambiguity, as opinion terms or phrases often carry different sentiment in different sorts of text. In such a case, a tradeoff has to be found that selects the sentiment most applicable to a general domain. In this paper we present a novel approach that extracts domain-specific adjectives from a Twitter corpus and uses them to expand a general lexicon. We build an undirected weighted graph of adjective pairs and use its weighted adjacency matrix as the input to a clustering algorithm.
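The graph step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how edge weights are computed or which clustering algorithm is used, so everything here is an assumption: the pair weights are made-up illustrative values, the function names `build_adjacency` and `cluster_by_threshold` are hypothetical, and thresholded connected components stand in for the unspecified clustering algorithm.

```python
def build_adjacency(pairs):
    """Return (vocab, matrix) for an undirected weighted graph.

    `pairs` is a list of (adjective_a, adjective_b, weight) tuples;
    weights of repeated pairs are summed.
    """
    vocab = sorted({word for a, b, _ in pairs for word in (a, b)})
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0.0] * n for _ in range(n)]
    for a, b, weight in pairs:
        i, j = index[a], index[b]
        if i == j:
            continue  # ignore self-loops
        matrix[i][j] += weight
        matrix[j][i] += weight  # undirected: keep the matrix symmetric
    return vocab, matrix


def cluster_by_threshold(vocab, matrix, min_weight):
    """Group words linked by edges of weight >= min_weight.

    A stand-in for the paper's clustering step (an assumption):
    weak edges are ignored and each remaining connected component
    becomes one cluster. `min_weight` should be positive, otherwise
    missing edges (weight 0.0) would connect everything.
    """
    n = len(vocab)
    seen = [False] * n
    clusters = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(vocab[i])
            for j in range(n):
                if not seen[j] and matrix[i][j] >= min_weight:
                    seen[j] = True
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


# Hypothetical adjective pairs with illustrative weights.
PAIRS = [
    ("good", "great", 3.0),
    ("great", "awesome", 2.0),
    ("bad", "awful", 3.0),
    ("awful", "terrible", 2.0),
    ("good", "bad", 0.5),  # weak link across polarities
]

vocab, matrix = build_adjacency(PAIRS)
clusters = cluster_by_threshold(vocab, matrix, min_weight=1.0)
print(clusters)
```

With these made-up weights, the weak "good"/"bad" edge falls below the threshold, so the positive and negative adjectives end up in separate clusters. A real system would need a principled weighting scheme and clustering method, neither of which is given in the abstract.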
Similar Resources
Opinion Word Expansion and Target Extraction through Double Propagation
Analysis of opinions, known as opinion mining or sentiment analysis, has attracted a great deal of attention recently due to many practical applications and challenging research problems. In this article, we study two important problems, namely, opinion lexicon expansion and opinion target extraction. Opinion targets (targets, for short) are entities and their attributes on which opinions have ...
Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion
Supervised learning approaches are domain-dependent and it is costly to obtain labeled training data from different domains. Lexicon-based approaches enjoy stable performance across domains, but often cannot capture domain-dependent features. It is also hard for lexicon-based classifiers to identify the polarities of abbreviations and misspellings, which are common in short informal social text b...
Using Data Mining Techniques for Sentiment Shifter Identification
Sentiment shifters, i.e., words and expressions that can affect text polarity, play an important role in opinion mining. However, the limited ability of current automated opinion mining systems to handle shifters represents a major challenge. The majority of existing approaches rely on a manual list of shifters; few attempts have been made to automatically identify shifters in text. Most of the...
Opinion Holder and Target Extraction based on the Induction of Verbal Categories
We present an approach for opinion role induction for verbal predicates. Our model rests on the assumption that opinion verbs can be divided into three different types where each type is associated with a characteristic mapping between semantic roles and opinion holders and targets. In several experiments, we demonstrate the relevance of those three categories for the task. We show that verbs c...
Aspect-Oriented Opinion Mining from User Reviews in Croatian
Aspect-oriented opinion mining aims to identify product aspects (features of products) about which opinion has been expressed in the text. We present an approach for aspect-oriented opinion mining from user reviews in Croatian. We propose methods for acquiring a domain-specific opinion lexicon, linking opinion clues to product aspects, and predicting polarity and rating of reviews. We show that...